Skip to content

QVAC-12239 feat[api]: add qvac doctor command and system-requirements doc#1681

Merged
simon-iribarren merged 15 commits into
tetherto:mainfrom
simon-iribarren:feat/qvac-12239-system-requirements-checker
Apr 28, 2026
Merged

QVAC-12239 feat[api]: add qvac doctor command and system-requirements doc#1681
simon-iribarren merged 15 commits into
tetherto:mainfrom
simon-iribarren:feat/qvac-12239-system-requirements-checker

Conversation

@simon-iribarren

@simon-iribarren simon-iribarren commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Today users only find out about missing system dependencies (wrong Node version, missing ffmpeg, unsupported platform, too little RAM / disk, @qvac/sdk not installed, …) through runtime errors deep in the SDK. There is no upfront validator and no published system-requirements.md to read before installing.

This is the pitch item "Add a system requirements checker (e.g. bun run check-system or integrated into CLI) that validates system dependencies before runtime" from the SDK DX & Tooling pitch (QVAC-12239). We ship it under the doctor verb to match brew doctor / flutter doctor / npm doctor, and to give us room to grow host + installed-model checks under one command without a later breaking rename.

How does it solve it?

  • New CLI subcommand qvac doctor that runs a preflight against the host:
    • Runtime — Node.js version (>= 18, warns on < 20), platform/arch against the supported matrix from DEFAULT_HOSTS.
    • Hardware — total RAM, free RAM, free disk space in the working directory (via fs.statfsSync, gracefully skipped on Node < 19.6).
    • Optional toolsffmpeg (transcription/mic), Bare runtime, Bun. All warn-only.
    • Project — whether @qvac/sdk is resolvable from node_modules.
  • Human-readable output by default, --json for CI/scripts, --quiet for exit-code-only usage. Exit 0 on success, 1 when any required check fails.
  • New packages/cli/system-requirements.md documenting the full matrix and JSON schema (this closes the "No system-requirements.md file" bullet in the Asana task).
  • README updated with a Command Reference entry and a System Requirements section.
  • Unit tests cover every pure check function (26 cases). Bats adds two smoke tests for the help output and JSON validity.
  • Bumped @qvac/cli to 0.3.0 (new backward-compatible feature).

New API

# Human-readable preflight report
qvac doctor

# JSON for CI / scripts
qvac doctor --json

# Fail-fast in a script
qvac doctor --quiet || exit 1

Sample output on a supported host:

🩺 QVAC doctor

  Host: darwin-arm64, Node 20.19.5

Runtime:
  ✅ Node.js version — v20.19.5
  ✅ Platform / architecture — darwin-arm64

Hardware:
  ✅ Total RAM — 24.00 GB
  ✅ Free RAM — 6.12 GB
  ✅ Free disk space (…) — 239.22 GB

Optional tools:
  ✅ ffmpeg — ffmpeg version 8.1 …
  ✅ Bare runtime — v1.24.2
  ✅ Bun — 1.2.21

Project:
  ✅ @qvac/sdk in node_modules — v0.8.0

✅ All required checks passed.

JSON shape (see system-requirements.md for the full schema):

interface DoctorReport {
  ok: boolean;
  platform: string;
  arch: string;
  nodeVersion: string;
  sections: Array<{
    id: 'runtime' | 'hardware' | 'tools' | 'project';
    title: string;
    checks: Array<{
      id: string;
      label: string;
      status: 'pass' | 'warn' | 'fail' | 'skip';
      severity: 'required' | 'recommended';
      value?: string;
      hint?: string;
    }>;
  }>;
}

How was it tested?

  • bun run typecheck — clean.
  • bun run test:unit — 26/26 pass (covers all pure check functions + injected probes for missing/present ffmpeg/bare/bun, SDK present/absent on disk, threshold behavior for Node version, RAM, and the supported host matrix).
  • bun run test:bats — 36/36 pass (34 pre-existing + 2 new: help output, --json validity).
  • Manual smoke of qvac doctor and qvac doctor --json on macOS arm64 with Node 20.19.5.

Out of scope

  • Installed-model validation — the pitch hints that doctor should eventually also "validate installed models". We ship host-only checks here and leave on-disk model validation for a follow-up so this first version stays scoped. The doctor name intentionally leaves room for that to land without another rename.
  • KV-cache / per-model memory validation — tracked separately as QVAC-11621 / PR QVAC-11621 fix: KV cache memory validation to prevent OOM crashes #1642.
  • Integrating the checker into qvac bundle sdk / qvac serve openai startup — possible follow-up; out of scope here to keep the PR small.
  • Raising engines.node above 18 — free disk check silently skips on < 19.6.

Related

@simon-iribarren simon-iribarren requested review from a team as code owners April 21, 2026 08:31
@simon-iribarren simon-iribarren changed the title QVAC-12239 feat[api]: add qvac check-system command and system-requirements doc QVAC-12239 feat[api]: add qvac doctor command and system-requirements doc Apr 21, 2026
@simon-iribarren simon-iribarren force-pushed the feat/qvac-12239-system-requirements-checker branch 2 times, most recently from 91f333c to d0a5699 Compare April 22, 2026 13:40
…(QVAC-12239)

Validates host preconditions (Node version, platform/arch, RAM, free disk,
ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users
get actionable hints instead of opaque crashes. Ships with human and --json
output, unit tests, bats smoke tests, and a published system-requirements.md.
Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor)
and aligns with the pitch's naming so we can grow host + installed-model
checks under a single command without a follow-up breaking change.
…2239)

Addresses correctness gaps found in the initial doctor implementation.

- Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on
  desktops only; mobile platforms are SDK *deploy* targets, not CLI
  hosts. Surface that distinction in labels, hints, and docs.

- Add a dedicated "Deploy targets (SDK)" section that reports the full
  SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into
  Desktop (informational/always pass via bare-pack cross-bundling),
  Android (checks `adb --version`), and iOS (checks `xcodebuild -version`
  on macOS hosts; informational on non-macOS hosts).

- Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the
  RAM check doesn't produce noisy false warnings on Linux/macOS where
  "free" under-reports reclaimable page cache.

- Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the
  disk-space check actually runs on Node 18.0-18.14 hosts instead of
  silently skipping.

- Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve`
  so hoisted installs (monorepos, Yarn/Bun workspaces) are found
  correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single
  source of truth.

- Introduce `info` CheckStatus + `informational` CheckSeverity for the
  iOS-on-non-darwin case, so it's clearly neither a warning nor a
  failure.

- Cover every new code path in unit tests (new hoisted-SDK test,
  all target-check permutations, info status handling).
@simon-iribarren simon-iribarren force-pushed the feat/qvac-12239-system-requirements-checker branch from 39edd0c to f22ffd1 Compare April 22, 2026 14:34
Comment thread packages/cli/src/doctor/checks.ts Outdated
Comment thread packages/cli/src/doctor/checks.ts Outdated
Comment thread packages/cli/src/doctor/checks.ts Outdated
Comment thread packages/cli/src/doctor/checks.ts Outdated
probeBinary uses spawnSync with no timeout, which means a misbehaving
binary on PATH (e.g. adb waiting for a device, an interactive --version
that stalls on TTY detection) can hang `qvac doctor` indefinitely.

Set timeout: 3000ms on the spawnSync call. 3s is generous for a
--version-style invocation while keeping the command snappy. Treat
SIGTERM explicitly as a failed probe so the tool is reported as
"not found" rather than silently hanging.

Addresses PR tetherto#1681 review comment from @NamelsKing.
The fail branch (< 2 GB) reported severity 'required' while the warn
and pass branches reported 'recommended'. Severity is meant to describe
the check itself, not the outcome of a specific branch — flipping it
makes report consumers that pivot by severity (e.g. "list all required
checks that passed") produce wrong results.

Total RAM has a hard gate (<2 GB fails the report) with a recommended
band on top, so the check is 'required' across all branches — the same
pattern checkNodeVersion already uses.

Also add a unit test locking this invariant across fail/warn/pass so
the inconsistency cannot silently come back.

Addresses PR tetherto#1681 review comment from @NamelsKing.
…rm Check interface (QVAC-12239)

Previously all checks lived in a single 493-line src/doctor/checks.ts
with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe),
checkDesktopTargets(platform, arch), …). Adding a new check meant
inventing yet another signature and remembering to wire it into
collectCheckSections, and tests had to know which knobs each check
accepted.

Introduce a single abstraction:

  type Check = (ctx: CheckContext) => CheckResult

where CheckContext carries everything a check can read (projectRoot,
platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes,
probe). createDefaultContext() reads the live host once; tests build a
context with exactly the overrides they need and get deterministic,
host-independent runs.

Files split out by report section:

  src/doctor/check.ts             -- CheckContext, Check, probeBinary, createDefaultContext
  src/doctor/checks/runtime.ts    -- node version, CLI host
  src/doctor/checks/hardware.ts   -- total/available RAM, free disk
  src/doctor/checks/targets.ts    -- desktop, android, ios
  src/doctor/checks/tools.ts      -- ffmpeg, bare, bun
  src/doctor/checks/project.ts    -- @qvac/sdk resolvable
  src/doctor/checks/index.ts      -- collectCheckSections, isReportOk, barrel re-exports

The orchestrator in src/doctor/index.ts is unchanged apart from the
import path. Existing behaviour and JSON shape are identical; this is
purely structural.

Tests are reworked to go through the new interface via a small makeCtx()
helper. Two new tests cover collectCheckSections({ context }) override
and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass.

Addresses PR tetherto#1681 review comment from @lauripiisang.
QVAC inference backends use Metal on macOS (ggml-metal is hard-linked
in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the
whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device
enumeration). Running LLM or Whisper inference without a GPU backend
falls back to CPU, which is roughly an order of magnitude slower.

`qvac doctor` previously gave users no visibility into whether GPU
acceleration was actually wired up on their host, so they could spin
up inference, see slow CPU-only throughput, and have no way to diagnose
why.

Add a platform-aware GPU check under the Hardware section:

  - darwin    -> pass, "Metal (native macOS backend)"
                 (Metal is always present on supported macOS versions)
  - linux     -> probe `vulkaninfo --summary`
                 -> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080")
                 -> warn with apt/dnf install hint when the ICD is missing
  - win32     -> probe `vulkaninfo --summary`
                 -> pass + deviceName list
                 -> warn with Vulkan SDK / GPU-driver hint when missing
  - other     -> info, "not checked on <platform>"

Severity is 'recommended' — missing GPU acceleration does not fail the
report (inference still runs on CPU), but a warning is surfaced with a
concrete remediation link so users can fix throughput proactively.

Implementation notes:

  - ProbeResult gains an optional `stdout` field so probes can return
    full multi-line output. The GPU check parses `deviceName = ...`
    lines to surface which GPUs the Vulkan loader sees.
  - The 3s probeBinary timeout applies automatically — a stuck
    vulkaninfo call will not hang `qvac doctor`.
  - 6 new unit tests cover darwin/linux/win32/unknown-platform paths,
    device-name extraction, and the no-device fallback.

Docs (README + system-requirements.md) updated.

Addresses PR tetherto#1681 review comment from @NamelsKing.
lauripiisang
lauripiisang previously approved these changes Apr 23, 2026
Victor-Rodzko
Victor-Rodzko previously approved these changes Apr 24, 2026
@simon-iribarren

Copy link
Copy Markdown
Contributor Author

/review

@github-actions

github-actions Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

simon-iribarren added a commit that referenced this pull request Apr 28, 2026
…pe (#1775)

PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+
embed() return shape — it changed from raw `Promise<number[] | number[][]>`
to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed`
destructures `{ embedding }` from the result, then the OpenAI
embeddings route reads it as `embeddings[0]`.

That PR landed the runtime change but did not bump the
`@qvac/sdk` semver in `packages/cli/package.json`. It still requires
`^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs
@qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI
then destructures `{ embedding }` from a number array, gets `undefined`,
and the route handler crashes on `embeddings[0]`. The error is caught,
the response becomes a 500 with `{ error: { code: 'embed_error' } }`,
and the e2e tests asserting `.object == "list"` fail with no obvious
hint as to why:

  not ok 40 embeddings: single input returns vector       (e2e.bats:184)
  not ok 41 embeddings: batch input returns multiple vectors  (e2e.bats:198)

These two tests have been red on every CLI PR's CI since #1596 merged
(visible on PR #1681, #1766, and others); chat and transcription
tests are unaffected because their SDK contracts didn't change.

Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install
time, and bump the runtime `MIN_SDK_VERSION` constant to match. The
SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available).

Verified locally: rm -rf node_modules bun.lock && bun install resolves
@qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests:

  ok 6 embeddings: single input returns vector
  ok 7 embeddings: batch input returns multiple vectors
@simon-iribarren

Copy link
Copy Markdown
Contributor Author

/review

@simon-iribarren simon-iribarren merged commit d98d235 into tetherto:main Apr 28, 2026
6 checks passed
GustavoA1604 pushed a commit that referenced this pull request Apr 29, 2026
…pe (#1775)

PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+
embed() return shape — it changed from raw `Promise<number[] | number[][]>`
to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed`
destructures `{ embedding }` from the result, then the OpenAI
embeddings route reads it as `embeddings[0]`.

That PR landed the runtime change but did not bump the
`@qvac/sdk` semver in `packages/cli/package.json`. It still requires
`^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs
@qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI
then destructures `{ embedding }` from a number array, gets `undefined`,
and the route handler crashes on `embeddings[0]`. The error is caught,
the response becomes a 500 with `{ error: { code: 'embed_error' } }`,
and the e2e tests asserting `.object == "list"` fail with no obvious
hint as to why:

  not ok 40 embeddings: single input returns vector       (e2e.bats:184)
  not ok 41 embeddings: batch input returns multiple vectors  (e2e.bats:198)

These two tests have been red on every CLI PR's CI since #1596 merged
(visible on PR #1681, #1766, and others); chat and transcription
tests are unaffected because their SDK contracts didn't change.

Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install
time, and bump the runtime `MIN_SDK_VERSION` constant to match. The
SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available).

Verified locally: rm -rf node_modules bun.lock && bun install resolves
@qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests:

  ok 6 embeddings: single input returns vector
  ok 7 embeddings: batch input returns multiple vectors
GustavoA1604 pushed a commit that referenced this pull request Apr 29, 2026
… doc (#1681)

* feat[api]: add qvac check-system command and system-requirements doc (QVAC-12239)

Validates host preconditions (Node version, platform/arch, RAM, free disk,
ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users
get actionable hints instead of opaque crashes. Ships with human and --json
output, unit tests, bats smoke tests, and a published system-requirements.md.

* mod: rename qvac check-system to qvac doctor (QVAC-12239)

Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor)
and aligns with the pitch's naming so we can grow host + installed-model
checks under a single command without a follow-up breaking change.

* mod: expand qvac doctor checks for correctness and completion (QVAC-12239)

Addresses correctness gaps found in the initial doctor implementation.

- Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on
  desktops only; mobile platforms are SDK *deploy* targets, not CLI
  hosts. Surface that distinction in labels, hints, and docs.

- Add a dedicated "Deploy targets (SDK)" section that reports the full
  SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into
  Desktop (informational/always pass via bare-pack cross-bundling),
  Android (checks `adb --version`), and iOS (checks `xcodebuild -version`
  on macOS hosts; informational on non-macOS hosts).

- Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the
  RAM check doesn't produce noisy false warnings on Linux/macOS where
  "free" under-reports reclaimable page cache.

- Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the
  disk-space check actually runs on Node 18.0-18.14 hosts instead of
  silently skipping.

- Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve`
  so hoisted installs (monorepos, Yarn/Bun workspaces) are found
  correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single
  source of truth.

- Introduce `info` CheckStatus + `informational` CheckSeverity for the
  iOS-on-non-darwin case, so it's clearly neither a warning nor a
  failure.

- Cover every new code path in unit tests (new hoisted-SDK test,
  all target-check permutations, info status handling).

* fix: cap doctor probeBinary at 3s to prevent hangs (QVAC-12239)

probeBinary uses spawnSync with no timeout, which means a misbehaving
binary on PATH (e.g. adb waiting for a device, an interactive --version
that stalls on TTY detection) can hang `qvac doctor` indefinitely.

Set timeout: 3000ms on the spawnSync call. 3s is generous for a
--version-style invocation while keeping the command snappy. Treat
SIGTERM explicitly as a failed probe so the tool is reported as
"not found" rather than silently hanging.

Addresses PR #1681 review comment from @NamelsKing.

* mod: normalize checkTotalMemory severity to 'required' (QVAC-12239)

The fail branch (< 2 GB) reported severity 'required' while the warn
and pass branches reported 'recommended'. Severity is meant to describe
the check itself, not the outcome of a specific branch — flipping it
makes report consumers that pivot by severity (e.g. "list all required
checks that passed") produce wrong results.

Total RAM has a hard gate (<2 GB fails the report) with a recommended
band on top, so the check is 'required' across all branches — the same
pattern checkNodeVersion already uses.

Also add a unit test locking this invariant across fail/warn/pass so
the inconsistency cannot silently come back.

Addresses PR #1681 review comment from @NamelsKing.

* refactor: split doctor checks into per-section modules behind a uniform Check interface (QVAC-12239)

Previously all checks lived in a single 493-line src/doctor/checks.ts
with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe),
checkDesktopTargets(platform, arch), …). Adding a new check meant
inventing yet another signature and remembering to wire it into
collectCheckSections, and tests had to know which knobs each check
accepted.

Introduce a single abstraction:

  type Check = (ctx: CheckContext) => CheckResult

where CheckContext carries everything a check can read (projectRoot,
platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes,
probe). createDefaultContext() reads the live host once; tests build a
context with exactly the overrides they need and get deterministic,
host-independent runs.

Files split out by report section:

  src/doctor/check.ts             -- CheckContext, Check, probeBinary, createDefaultContext
  src/doctor/checks/runtime.ts    -- node version, CLI host
  src/doctor/checks/hardware.ts   -- total/available RAM, free disk
  src/doctor/checks/targets.ts    -- desktop, android, ios
  src/doctor/checks/tools.ts      -- ffmpeg, bare, bun
  src/doctor/checks/project.ts    -- @qvac/sdk resolvable
  src/doctor/checks/index.ts      -- collectCheckSections, isReportOk, barrel re-exports

The orchestrator in src/doctor/index.ts is unchanged apart from the
import path. Existing behaviour and JSON shape are identical; this is
purely structural.

Tests are reworked to go through the new interface via a small makeCtx()
helper. Two new tests cover collectCheckSections({ context }) override
and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass.

Addresses PR #1681 review comment from @lauripiisang.

* feat: add GPU acceleration check to qvac doctor (QVAC-12239)

QVAC inference backends use Metal on macOS (ggml-metal is hard-linked
in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the
whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device
enumeration). Running LLM or Whisper inference without a GPU backend
falls back to CPU, which is roughly an order of magnitude slower.

`qvac doctor` previously gave users no visibility into whether GPU
acceleration was actually wired up on their host, so they could spin
up inference, see slow CPU-only throughput, and have no way to diagnose
why.

Add a platform-aware GPU check under the Hardware section:

  - darwin    -> pass, "Metal (native macOS backend)"
                 (Metal is always present on supported macOS versions)
  - linux     -> probe `vulkaninfo --summary`
                 -> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080")
                 -> warn with apt/dnf install hint when the ICD is missing
  - win32     -> probe `vulkaninfo --summary`
                 -> pass + deviceName list
                 -> warn with Vulkan SDK / GPU-driver hint when missing
  - other     -> info, "not checked on <platform>"

Severity is 'recommended' — missing GPU acceleration does not fail the
report (inference still runs on CPU), but a warning is surfaced with a
concrete remediation link so users can fix throughput proactively.

Implementation notes:

  - ProbeResult gains an optional `stdout` field so probes can return
    full multi-line output. The GPU check parses `deviceName = ...`
    lines to surface which GPUs the Vulkan loader sees.
  - The 3s probeBinary timeout applies automatically — a stuck
    vulkaninfo call will not hang `qvac doctor`.
  - 6 new unit tests cover darwin/linux/win32/unknown-platform paths,
    device-name extraction, and the no-device fallback.

Docs (README + system-requirements.md) updated.

Addresses PR #1681 review comment from @NamelsKing.
Proletter pushed a commit that referenced this pull request May 24, 2026
…pe (#1775)

PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+
embed() return shape — it changed from raw `Promise<number[] | number[][]>`
to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed`
destructures `{ embedding }` from the result, then the OpenAI
embeddings route reads it as `embeddings[0]`.

That PR landed the runtime change but did not bump the
`@qvac/sdk` semver in `packages/cli/package.json`. It still requires
`^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs
@qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI
then destructures `{ embedding }` from a number array, gets `undefined`,
and the route handler crashes on `embeddings[0]`. The error is caught,
the response becomes a 500 with `{ error: { code: 'embed_error' } }`,
and the e2e tests asserting `.object == "list"` fail with no obvious
hint as to why:

  not ok 40 embeddings: single input returns vector       (e2e.bats:184)
  not ok 41 embeddings: batch input returns multiple vectors  (e2e.bats:198)

These two tests have been red on every CLI PR's CI since #1596 merged
(visible on PR #1681, #1766, and others); chat and transcription
tests are unaffected because their SDK contracts didn't change.

Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install
time, and bump the runtime `MIN_SDK_VERSION` constant to match. The
SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available).

Verified locally: rm -rf node_modules bun.lock && bun install resolves
@qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests:

  ok 6 embeddings: single input returns vector
  ok 7 embeddings: batch input returns multiple vectors
Proletter pushed a commit that referenced this pull request May 24, 2026
… doc (#1681)

* feat[api]: add qvac check-system command and system-requirements doc (QVAC-12239)

Validates host preconditions (Node version, platform/arch, RAM, free disk,
ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users
get actionable hints instead of opaque crashes. Ships with human and --json
output, unit tests, bats smoke tests, and a published system-requirements.md.

* mod: rename qvac check-system to qvac doctor (QVAC-12239)

Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor)
and aligns with the pitch's naming so we can grow host + installed-model
checks under a single command without a follow-up breaking change.

* mod: expand qvac doctor checks for correctness and completion (QVAC-12239)

Addresses correctness gaps found in the initial doctor implementation.

- Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on
  desktops only; mobile platforms are SDK *deploy* targets, not CLI
  hosts. Surface that distinction in labels, hints, and docs.

- Add a dedicated "Deploy targets (SDK)" section that reports the full
  SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into
  Desktop (informational/always pass via bare-pack cross-bundling),
  Android (checks `adb --version`), and iOS (checks `xcodebuild -version`
  on macOS hosts; informational on non-macOS hosts).

- Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the
  RAM check doesn't produce noisy false warnings on Linux/macOS where
  "free" under-reports reclaimable page cache.

- Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the
  disk-space check actually runs on Node 18.0-18.14 hosts instead of
  silently skipping.

- Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve`
  so hoisted installs (monorepos, Yarn/Bun workspaces) are found
  correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single
  source of truth.

- Introduce `info` CheckStatus + `informational` CheckSeverity for the
  iOS-on-non-darwin case, so it's clearly neither a warning nor a
  failure.

- Cover every new code path in unit tests (new hoisted-SDK test,
  all target-check permutations, info status handling).

* fix: cap doctor probeBinary at 3s to prevent hangs (QVAC-12239)

probeBinary uses spawnSync with no timeout, which means a misbehaving
binary on PATH (e.g. adb waiting for a device, an interactive --version
that stalls on TTY detection) can hang `qvac doctor` indefinitely.

Set timeout: 3000ms on the spawnSync call. 3s is generous for a
--version-style invocation while keeping the command snappy. Treat
SIGTERM explicitly as a failed probe so the tool is reported as
"not found" rather than silently hanging.

Addresses PR #1681 review comment from @NamelsKing.

* mod: normalize checkTotalMemory severity to 'required' (QVAC-12239)

The fail branch (< 2 GB) reported severity 'required' while the warn
and pass branches reported 'recommended'. Severity is meant to describe
the check itself, not the outcome of a specific branch — flipping it
makes report consumers that pivot by severity (e.g. "list all required
checks that passed") produce wrong results.

Total RAM has a hard gate (<2 GB fails the report) with a recommended
band on top, so the check is 'required' across all branches — the same
pattern checkNodeVersion already uses.

Also add a unit test locking this invariant across fail/warn/pass so
the inconsistency cannot silently come back.

Addresses PR #1681 review comment from @NamelsKing.

* refactor: split doctor checks into per-section modules behind a uniform Check interface (QVAC-12239)

Previously all checks lived in a single 493-line src/doctor/checks.ts
with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe),
checkDesktopTargets(platform, arch), …). Adding a new check meant
inventing yet another signature and remembering to wire it into
collectCheckSections, and tests had to know which knobs each check
accepted.

Introduce a single abstraction:

  type Check = (ctx: CheckContext) => CheckResult

where CheckContext carries everything a check can read (projectRoot,
platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes,
probe). createDefaultContext() reads the live host once; tests build a
context with exactly the overrides they need and get deterministic,
host-independent runs.

Files split out by report section:

  src/doctor/check.ts             -- CheckContext, Check, probeBinary, createDefaultContext
  src/doctor/checks/runtime.ts    -- node version, CLI host
  src/doctor/checks/hardware.ts   -- total/available RAM, free disk
  src/doctor/checks/targets.ts    -- desktop, android, ios
  src/doctor/checks/tools.ts      -- ffmpeg, bare, bun
  src/doctor/checks/project.ts    -- @qvac/sdk resolvable
  src/doctor/checks/index.ts      -- collectCheckSections, isReportOk, barrel re-exports

The orchestrator in src/doctor/index.ts is unchanged apart from the
import path. Existing behaviour and JSON shape are identical; this is
purely structural.

Tests are reworked to go through the new interface via a small makeCtx()
helper. Two new tests cover collectCheckSections({ context }) override
and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass.

Addresses PR #1681 review comment from @lauripiisang.

* feat: add GPU acceleration check to qvac doctor (QVAC-12239)

QVAC inference backends use Metal on macOS (ggml-metal is hard-linked
in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the
whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device
enumeration). Running LLM or Whisper inference without a GPU backend
falls back to CPU, which is roughly an order of magnitude slower.

`qvac doctor` previously gave users no visibility into whether GPU
acceleration was actually wired up on their host, so they could spin
up inference, see slow CPU-only throughput, and have no way to diagnose
why.

Add a platform-aware GPU check under the Hardware section:

  - darwin    -> pass, "Metal (native macOS backend)"
                 (Metal is always present on supported macOS versions)
  - linux     -> probe `vulkaninfo --summary`
                 -> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080")
                 -> warn with apt/dnf install hint when the ICD is missing
  - win32     -> probe `vulkaninfo --summary`
                 -> pass + deviceName list
                 -> warn with Vulkan SDK / GPU-driver hint when missing
  - other     -> info, "not checked on <platform>"

Severity is 'recommended' — missing GPU acceleration does not fail the
report (inference still runs on CPU), but a warning is surfaced with a
concrete remediation link so users can fix throughput proactively.

Implementation notes:

  - ProbeResult gains an optional `stdout` field so probes can return
    full multi-line output. The GPU check parses `deviceName = ...`
    lines to surface which GPUs the Vulkan loader sees.
  - The 3s probeBinary timeout applies automatically — a stuck
    vulkaninfo call will not hang `qvac doctor`.
  - 6 new unit tests cover darwin/linux/win32/unknown-platform paths,
    device-name extraction, and the no-device fallback.

Docs (README + system-requirements.md) updated.

Addresses PR #1681 review comment from @NamelsKing.
Proletter pushed a commit that referenced this pull request May 24, 2026
…pe (#1775)

PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+
embed() return shape — it changed from raw `Promise<number[] | number[][]>`
to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed`
destructures `{ embedding }` from the result, then the OpenAI
embeddings route reads it as `embeddings[0]`.

That PR landed the runtime change but did not bump the
`@qvac/sdk` semver in `packages/cli/package.json`. It still requires
`^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs
@qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI
then destructures `{ embedding }` from a number array, gets `undefined`,
and the route handler crashes on `embeddings[0]`. The error is caught,
the response becomes a 500 with `{ error: { code: 'embed_error' } }`,
and the e2e tests asserting `.object == "list"` fail with no obvious
hint as to why:

  not ok 40 embeddings: single input returns vector       (e2e.bats:184)
  not ok 41 embeddings: batch input returns multiple vectors  (e2e.bats:198)

These two tests have been red on every CLI PR's CI since #1596 merged
(visible on PR #1681, #1766, and others); chat and transcription
tests are unaffected because their SDK contracts didn't change.

Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install
time, and bump the runtime `MIN_SDK_VERSION` constant to match. The
SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available).

Verified locally: rm -rf node_modules bun.lock && bun install resolves
@qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests:

  ok 6 embeddings: single input returns vector
  ok 7 embeddings: batch input returns multiple vectors
Proletter pushed a commit that referenced this pull request May 24, 2026
… doc (#1681)

* feat[api]: add qvac check-system command and system-requirements doc (QVAC-12239)

Validates host preconditions (Node version, platform/arch, RAM, free disk,
ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users
get actionable hints instead of opaque crashes. Ships with human and --json
output, unit tests, bats smoke tests, and a published system-requirements.md.

* mod: rename qvac check-system to qvac doctor (QVAC-12239)

Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor)
and aligns with the pitch's naming so we can grow host + installed-model
checks under a single command without a follow-up breaking change.

* mod: expand qvac doctor checks for correctness and completion (QVAC-12239)

Addresses correctness gaps found in the initial doctor implementation.

- Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on
  desktops only; mobile platforms are SDK *deploy* targets, not CLI
  hosts. Surface that distinction in labels, hints, and docs.

- Add a dedicated "Deploy targets (SDK)" section that reports the full
  SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into
  Desktop (informational/always pass via bare-pack cross-bundling),
  Android (checks `adb --version`), and iOS (checks `xcodebuild -version`
  on macOS hosts; informational on non-macOS hosts).

- Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the
  RAM check doesn't produce noisy false warnings on Linux/macOS where
  "free" under-reports reclaimable page cache.

- Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the
  disk-space check actually runs on Node 18.0-18.14 hosts instead of
  silently skipping.

- Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve`
  so hoisted installs (monorepos, Yarn/Bun workspaces) are found
  correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single
  source of truth.

- Introduce `info` CheckStatus + `informational` CheckSeverity for the
  iOS-on-non-darwin case, so it's clearly neither a warning nor a
  failure.

- Cover every new code path in unit tests (new hoisted-SDK test,
  all target-check permutations, info status handling).

* fix: cap doctor probeBinary at 3s to prevent hangs (QVAC-12239)

probeBinary uses spawnSync with no timeout, which means a misbehaving
binary on PATH (e.g. adb waiting for a device, an interactive --version
that stalls on TTY detection) can hang `qvac doctor` indefinitely.

Set timeout: 3000ms on the spawnSync call. 3s is generous for a
--version-style invocation while keeping the command snappy. Treat
SIGTERM explicitly as a failed probe so the tool is reported as
"not found" rather than silently hanging.

Addresses PR #1681 review comment from @NamelsKing.

* mod: normalize checkTotalMemory severity to 'required' (QVAC-12239)

The fail branch (< 2 GB) reported severity 'required' while the warn
and pass branches reported 'recommended'. Severity is meant to describe
the check itself, not the outcome of a specific branch — flipping it
makes report consumers that pivot by severity (e.g. "list all required
checks that passed") produce wrong results.

Total RAM has a hard gate (<2 GB fails the report) with a recommended
band on top, so the check is 'required' across all branches — the same
pattern checkNodeVersion already uses.

Also add a unit test locking this invariant across fail/warn/pass so
the inconsistency cannot silently come back.

Addresses PR #1681 review comment from @NamelsKing.

* refactor: split doctor checks into per-section modules behind a uniform Check interface (QVAC-12239)

Previously all checks lived in a single 493-line src/doctor/checks.ts
with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe),
checkDesktopTargets(platform, arch), …). Adding a new check meant
inventing yet another signature and remembering to wire it into
collectCheckSections, and tests had to know which knobs each check
accepted.

Introduce a single abstraction:

  type Check = (ctx: CheckContext) => CheckResult

where CheckContext carries everything a check can read (projectRoot,
platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes,
probe). createDefaultContext() reads the live host once; tests build a
context with exactly the overrides they need and get deterministic,
host-independent runs.

Files split out by report section:

  src/doctor/check.ts             -- CheckContext, Check, probeBinary, createDefaultContext
  src/doctor/checks/runtime.ts    -- node version, CLI host
  src/doctor/checks/hardware.ts   -- total/available RAM, free disk
  src/doctor/checks/targets.ts    -- desktop, android, ios
  src/doctor/checks/tools.ts      -- ffmpeg, bare, bun
  src/doctor/checks/project.ts    -- @qvac/sdk resolvable
  src/doctor/checks/index.ts      -- collectCheckSections, isReportOk, barrel re-exports

The orchestrator in src/doctor/index.ts is unchanged apart from the
import path. Existing behaviour and JSON shape are identical; this is
purely structural.

Tests are reworked to go through the new interface via a small makeCtx()
helper. Two new tests cover collectCheckSections({ context }) override
and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass.

Addresses PR #1681 review comment from @lauripiisang.

* feat: add GPU acceleration check to qvac doctor (QVAC-12239)

QVAC inference backends use Metal on macOS (ggml-metal is hard-linked
in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the
whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device
enumeration). Running LLM or Whisper inference without a GPU backend
falls back to CPU, which is roughly an order of magnitude slower.

`qvac doctor` previously gave users no visibility into whether GPU
acceleration was actually wired up on their host, so they could spin
up inference, see slow CPU-only throughput, and have no way to diagnose
why.

Add a platform-aware GPU check under the Hardware section:

  - darwin    -> pass, "Metal (native macOS backend)"
                 (Metal is always present on supported macOS versions)
  - linux     -> probe `vulkaninfo --summary`
                 -> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080")
                 -> warn with apt/dnf install hint when the ICD is missing
  - win32     -> probe `vulkaninfo --summary`
                 -> pass + deviceName list
                 -> warn with Vulkan SDK / GPU-driver hint when missing
  - other     -> info, "not checked on <platform>"

Severity is 'recommended' — missing GPU acceleration does not fail the
report (inference still runs on CPU), but a warning is surfaced with a
concrete remediation link so users can fix throughput proactively.

Implementation notes:

  - ProbeResult gains an optional `stdout` field so probes can return
    full multi-line output. The GPU check parses `deviceName = ...`
    lines to surface which GPUs the Vulkan loader sees.
  - The 3s probeBinary timeout applies automatically — a stuck
    vulkaninfo call will not hang `qvac doctor`.
  - 6 new unit tests cover darwin/linux/win32/unknown-platform paths,
    device-name extraction, and the no-device fallback.

Docs (README + system-requirements.md) updated.

Addresses PR #1681 review comment from @NamelsKing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants